
OPTIMAL  LINEAR  ESTIMATION  OF  BOUNDS  OF  RANDOM  VARIABLES 

BY 

PETER  COOKE 


TECHNICAL  REPORT  NO.  37 
SEPTEMBER  24,  1979 


PREPARED  UNDER  GRANT 
DAAG29-77-G-0031 

FOR  THE  U.S.  ARMY  RESEARCH  OFFICE 


Reproduction  in  Whole  or  in  Part  is  Permitted 
for  any  purpose  of  the  United  States  Government 

Approved  for  public  release;  distribution  unlimited. 


DEPARTMENT  OF  STATISTICS 
STANFORD  UNIVERSITY 
STANFORD,  CALIFORNIA 



Herbert  Solomon,  Project  Director 




Partially supported under Office of Naval Research Contract N00014-76-C-0475
(NR-042-267) and issued as Technical Report No. 276.


The  findings  in  this  report  are  not  to 
be  construed  as  an  official  Department 
of  the  Army  position,  unless  so 
designated  by  other  authorized  documents. 



1.  Introduction. 

Suppose $X_1, X_2, \ldots, X_n$ are independent random variables, each with
density $f(x)$ and distribution function $F(x)$, where $F(x) \in (0,1)$
only for $x \in (\varphi, \theta)$. Let $Y_1 < Y_2 < \cdots < Y_n$ denote the order
statistics based on $X_1, X_2, \ldots, X_n$. The parameter $\theta$ is known to be
finite and is the parameter of interest. The large sample inference
for $\theta$ which follows applies whether or not $\varphi$ is known, though when
$\varphi = -\infty$ we will need to assume that the $X$'s have finite second moment,
since our estimators are linear functions of the order statistics and
we will compare estimators through their mean squared errors.

When only the $r$ largest observations are used to estimate $\theta$, a
linear estimator is of the form

$$\hat\theta_{n,r} = \sum_{i=1}^{r} a_i Y_{n-i+1}. \tag{1}$$


In section 2 we will show, for fixed $r \ge 2$, how the coefficients
$a_1, a_2, \ldots, a_r$ can be determined so as to yield the estimator of the
form (1) with asymptotically smallest mean squared error.

2.  Determination of the Coefficients.

Since we are discussing large sample theory, the parameter of interest
is the upper endpoint of the distribution, and our inference is based on
the largest few observations, from a practical point of view we do not
need to know the form of $f$; we need only characterize the shape of its
upper tail. Thus, as in Cooke (1979), we will consider the case in which

$$F^n(y) \sim \exp\left\{-\left(\frac{\theta - y}{\theta - u_n}\right)^{1/\nu}\right\} \quad \text{as } n \to \infty, \tag{2}$$

for which Gnedenko's (1943) necessary and sufficient condition is that,
for $c > 0$,

$$\lim_{y \to 0^-} \frac{1 - F(cy + \theta)}{1 - F(y + \theta)} = c^{1/\nu}, \quad \text{where } u_n = F^{-1}\left(1 - \frac{1}{n}\right).$$


The value $\nu = 1$ corresponds to densities $f(x)$ which are truncated
at $\theta$; that is, $0 < f(\theta) < \infty$. In general, $\nu = 1/(k+1)$ for a density
which is zero or infinite at $\theta$ and whose first finite, nonzero left
derivative at $\theta$ is its $k$th left derivative.
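
As an illustration (the density below is chosen here as an example and does not appear in the report), take $f(x) = 2(\theta - x)/\theta^2$ on $(0, \theta)$. It vanishes at $\theta$ and its first left derivative there is nonzero, so $k = 1$ and $\nu = 1/2$. Gnedenko's condition and the scale $\theta - u_n$ can then be checked directly:

```latex
1 - F(y) = \int_y^{\theta} \frac{2(\theta - x)}{\theta^2}\,dx = \frac{(\theta - y)^2}{\theta^2},
\qquad
\frac{1 - F(cy + \theta)}{1 - F(y + \theta)} = \frac{(cy)^2}{y^2} = c^2 = c^{1/\nu},
\qquad
1 - F(u_n) = \frac{1}{n} \;\Longrightarrow\; \theta - u_n = \theta\, n^{-1/2}.
```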

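It is proved in Cooke (1979) that, when Gnedenko's condition holds
and $n \to \infty$, for $i \ge 1$ and $i$ small,

$$E(Y_{n-i+1}) \sim \theta - (\theta - u_n)\,\frac{\Gamma(\nu+i)}{\Gamma(i)} \tag{3}$$

and, for $i \ge j \ge 1$,

$$E\{(\theta - Y_{n-i+1})(\theta - Y_{n-j+1})\} \sim (\theta - u_n)^2\,\frac{\Gamma(2\nu+i)\,\Gamma(\nu+j)}{\Gamma(\nu+i)\,\Gamma(j)}, \tag{4}$$

where $\Gamma(\alpha)$ is the familiar gamma function defined by

$$\Gamma(\alpha) = \int_0^{\infty} e^{-x} x^{\alpha-1}\,dx \quad \text{for } \alpha > 0.$$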

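It follows from (3) that as $n \to \infty$

$$E(\hat\theta_{n,r} - \theta) \sim \theta\Big(\sum_{i=1}^{r} a_i - 1\Big) - (\theta - u_n)\sum_{i=1}^{r} a_i\,\frac{\Gamma(\nu+i)}{\Gamma(i)}. \tag{5}$$

When $\sum_{i=1}^{r} a_i = 1$, which we require for $\hat\theta_{n,r}$ to be a consistent
estimator of $\theta$, using (4) and (5) we find, when $n \to \infty$,

$$E(\hat\theta_{n,r} - \theta)^2 \sim (\theta - u_n)^2 \Big\{ \sum_{i=1}^{r} a_i^2\,\frac{\Gamma(2\nu+i)}{\Gamma(i)} + 2\sum_{i=2}^{r}\sum_{j=1}^{i-1} a_i a_j\,\frac{\Gamma(2\nu+i)\,\Gamma(\nu+j)}{\Gamma(\nu+i)\,\Gamma(j)} \Big\}. \tag{6}$$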


The quadratic form on the right in (6) can be written as $\mathbf{a}'\Lambda\mathbf{a}$,
where $\mathbf{a}$ is a column vector with elements $a_1, a_2, \ldots, a_r$ and $\Lambda$ is a
symmetric $r \times r$ matrix with $(i,j)$th element

$$\lambda_{ij} = \frac{\Gamma(2\nu+i)\,\Gamma(\nu+j)}{\Gamma(\nu+i)\,\Gamma(j)}, \qquad i \ge j \ge 1.$$

If we let $\mathbf{1}$ denote the $r \times 1$ vector with each element equal to
1, our problem reduces to finding the vector $\mathbf{a}$ which minimizes $\mathbf{a}'\Lambda\mathbf{a}$
subject to $\mathbf{a}'\mathbf{1} = 1$. The minimization is achieved by the vector
$\mathbf{a} = (\mathbf{1}'\Lambda^{-1}\mathbf{1})^{-1}\Lambda^{-1}\mathbf{1}$ and the minimum value of $\mathbf{a}'\Lambda\mathbf{a}$ is $(\mathbf{1}'\Lambda^{-1}\mathbf{1})^{-1}$.
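
The minimization can be sketched numerically as follows. This is an illustrative implementation, not the report's own computation: the function names and the plain Gaussian elimination routine are assumptions made here so that the example needs only the standard library. It builds the matrix of the quadratic form in (6), solves for the minimizing vector, and returns it together with the minimum value of the quadratic form.

```python
from math import gamma


def lam(nu, i, j):
    """(i, j) element of the symmetric matrix of the quadratic form; 1-based indices."""
    if i < j:
        i, j = j, i
    return gamma(2 * nu + i) * gamma(nu + j) / (gamma(nu + i) * gamma(j))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, n):
            factor = aug[k][col] / aug[col][col]
            for m in range(col, n + 1):
                aug[k][m] -= factor * aug[col][m]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][m] * x[m] for m in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def optimal_coefficients(nu, r):
    """Return (a, q_min): a = M^{-1} 1 / (1' M^{-1} 1) and q_min = 1 / (1' M^{-1} 1)."""
    matrix = [[lam(nu, i, j) for j in range(1, r + 1)] for i in range(1, r + 1)]
    w = solve(matrix, [1.0] * r)  # w = M^{-1} 1
    total = sum(w)                # 1' M^{-1} 1
    return [wi / total for wi in w], 1.0 / total


coeffs, q_min = optimal_coefficients(0.5, 3)
```

For $\nu = 1/2$ and $r = 3$ the result can be compared with the corresponding row of Table 1; the coefficients sum to one by construction.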


3.  Computations.

In the tables to follow we have, correct to three decimal places,
values of the coefficients of the $r$ largest order statistics for the
minimum mean squared error estimator of $\theta$, which henceforth we denote
by $\hat\theta_{n,r}$. Also tabulated are some values of

$$\gamma_r(\nu) = \lim_{n \to \infty} (\theta - u_n)^{-2}\,E(\hat\theta_{n,r} - \theta)^2 .$$

The truncation case $\nu = 1$ is probably the most important case from
a practical point of view, but is singled out here in view of the special
nature of the minimizing coefficients $a_1, a_2, \ldots, a_r$. Gnedenko's condition
suggests that for $y$ close to $\theta$, $1 - F(y) \approx (\theta - y)^{1/\nu}$ and hence, when
$\nu = 1$, that $F(y)$ is linear in $y$ for $y$ near $\theta$. This corresponds
to a Uniform distribution. If indeed $Y_{n-r+1}, Y_{n-r+2}, \ldots, Y_n$ are the $r$
largest order statistics from a Uniform distribution with upper endpoint
$\theta$ and $Y_1, Y_2, \ldots, Y_{n-r}$ are ignored, then $Y_{n-r+1}$ and $Y_n$ are jointly
sufficient for $\theta$, in which case it follows that the minimum mean squared
error estimator of $\theta$ will be a linear function of $Y_{n-r+1}$ and $Y_n$
alone. Thus $a_2 = a_3 = \cdots = a_{r-1} = 0$ when $\nu = 1$. The increasing
dependence on $Y_{n-r+2}, Y_{n-r+3}, \ldots, Y_{n-1}$ with decreasing $\nu$ or, equivalently,
increasing power of $(\theta - y)$, is apparent from the tables to follow.

Using (6) with $\nu = 1$ and $a_2 = a_3 = \cdots = a_{r-1} = 0$ we easily find
that the minimizing coefficients are $a_1 = 1 + r^{-1}$, $a_r = -r^{-1}$ and that
the minimum value of $\gamma_r(1)$ is $1 + r^{-1}$. It follows that $\gamma_r(1)$ cannot
be smaller than 1 for any $r > 1$ and that a nearly optimal estimator
is obtained with a fairly small value of $r$.
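
The calculation behind the truncation-case coefficients can be spelled out as follows. It uses only (6) with $\nu = 1$, $r \ge 2$, and the constraint $a_1 + a_r = 1$. Write $a_r = b$, $a_1 = 1 - b$, and let $Q(b)$ denote the right side of (6) divided by $(\theta - u_n)^2$; the notation $b$ and $Q$ is introduced here for the illustration. Since $\Gamma(2+i)/\Gamma(i) = i(i+1)$ and $\Gamma(2+r)\Gamma(2)/\{\Gamma(1+r)\Gamma(1)\} = r+1$:

```latex
\begin{aligned}
Q(b) &= 2(1-b)^2 + 2(r+1)\,b(1-b) + r(r+1)\,b^2 \\
     &= 2 + 2(r-1)\,b + r(r-1)\,b^2, \\
Q'(b) &= 2(r-1) + 2r(r-1)\,b = 0 \;\Longrightarrow\; b = -r^{-1}, \\
Q(-r^{-1}) &= 2 - \frac{2(r-1)}{r} + \frac{r-1}{r} = 1 + r^{-1}.
\end{aligned}
```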
Table 1

Minimizing Coefficients and Asymptotic Mean Squared Error of
the Optimal Estimator.


$\nu = 1/2$

 r     a_1     a_2     a_3     a_4     a_5     a_6     a_7  $\gamma_r(1/2)$
 2       2      -1                                              .667
 3   1.636    .273   -.909                                      .545
 4   1.440    .240    .160   -.840                              .480
 5   1.314    .219    .146    .109   -.788                      .438
 6   1.224    .204    .136    .102    .082   -.748              .408
 7   1.157    .193    .129    .096    .077    .064   -.716      .386


$\nu = 1/3$

 r     a_1     a_2     a_3     a_4     a_5     a_6     a_7  $\gamma_r(1/3)$
 2     2.5    -1.5                                              .564
 3   1.951    .585  -1.537                                      .440
 4   1.654    .496    .372  -1.523                              .373
 5   1.463    .439    .329    .269  -1.501                      .330
 6   1.328    .398    .299    .244    .210  -1.479              .300
 7   1.226    .368    .276    .226    .193    .171  -1.459      .277


$\nu = 1/4$

 r     a_1     a_2     a_3     a_4     a_5     a_6     a_7  $\gamma_r(1/4)$
 2       3      -2                                              .532
 3   2.273    .909  -2.182                                      .403
 4   1.882    .753    .602  -2.237                              .334
 5   1.632    .653    .522    .448  -2.255                      .289
 6   1.456    .583    .466    .399    .355  -2.260              .258
 7   1.325    .530    .424    .363    .323    .294  -2.259      .235


$\nu = 1/5$

 r     a_1     a_2     a_3     a_4     a_5     a_6     a_7  $\gamma_r(1/5)$
 2     3.5    -2.5                                              .518
 3   2.598   1.237  -2.835                                      .384
 4   2.117   1.008    .840  -2.964                              .313
 5   1.811    .863    .719    .634  -3.027                      .268
 6   1.598    .761    .634    .560    .509  -3.062              .236
 7   1.439    .685    .571    .504    .458    .424  -3.082      .213

Although the minimizing coefficients are not given above for
$r = 20$, except when $\nu = 1$, the following table gives values of
$\eta_{20}(\nu)$, where

$$\eta_r(\nu) = \lim_{n \to \infty} \frac{E(\hat\theta_{n,r} - \theta)^2}{E(\bar\theta_n - \theta)^2}$$

is the asymptotic efficiency of $\bar\theta_n$ relative to $\hat\theta_{n,r}$ and, as discussed
in Cooke (1979), $\bar\theta_n$ is the estimator of the form


$$Y_n + a\Big\{Y_n - (1 - e^{-1})\sum_{i=0}^{n-1} e^{-i}\,Y_{n-i}\Big\}$$


with asymptotically smallest mean squared error and is the best estimator
derived until now. Also tabulated are values of $\delta_{20}(\nu)$, where

$$\delta_r(\nu) = \lim_{n \to \infty} \frac{E(\hat\theta_{n,r} - \theta)^2}{E(Y_n - \theta)^2},$$

to illustrate the considerable progress which has been made in finding
better estimators of $\theta$ than $Y_n$ since Robson and Whitlock's (1964)
attempt in the truncation case.


Table 2

Efficiencies Relative to the Optimal Estimator Based on the 20
Largest Observations and Improvement Over $Y_n$

$\nu$                    1     1/2     1/3     1/4     1/5
$\eta_{20}(\nu)$      .798    .494    .357    .257    .252
$\delta_{20}(\nu)$    .525    .278    .185    .143    .120
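
The improvement over $Y_n$ is tied to the asymptotic mean squared error of Table 1 by a simple identity. Applying (4) with $i = j = 1$ gives $E(Y_n - \theta)^2 \sim (\theta - u_n)^2\,\Gamma(2\nu + 1)$, so $\delta_r(\nu) = \gamma_r(\nu)/\Gamma(2\nu + 1)$. The sketch below (the function name is illustrative) applies this to the closed form $\gamma_r(1) = 1 + r^{-1}$ from section 3 and to the $r = 7$ values of $\gamma_r(\nu)$ copied from Table 1.

```python
from math import gamma


def delta_from_gamma(gamma_r, nu):
    """delta_r(nu) = gamma_r(nu) / Gamma(2*nu + 1)."""
    return gamma_r / gamma(2.0 * nu + 1.0)


# nu = 1, r = 20: gamma_20(1) = 1 + 1/20, which gives the .525 entry of Table 2.
delta_20_truncation = delta_from_gamma(1.0 + 1.0 / 20, 1.0)

# r = 7 values of gamma_r(nu) copied from Table 1, keyed by nu.
table1_gamma_7 = {1 / 2: 0.386, 1 / 3: 0.277, 1 / 4: 0.235, 1 / 5: 0.213}
delta_7 = {nu: delta_from_gamma(g, nu) for nu, g in table1_gamma_7.items()}
```

Because $\Gamma(2) = 1$, the case $\nu = 1/2$ has $\delta_r(1/2) = \gamma_r(1/2)$.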

4.  Estimation of $\varphi$.

When $\varphi$ is known to be finite and is the parameter of interest, for
given $r > 1$ we seek the estimator of the form $\hat\varphi = \sum_{i=1}^{r} a_i Y_i$ with
asymptotically smallest mean squared error.

If the lower tail of $f$ is characterized by the constant $\nu$ in
the way the upper tail is characterized above, then the minimizing
coefficients are precisely those in section 3 since, if $X_1, X_2, \ldots, X_n$ are independent
with lower bound $\varphi$ and $\nu$ characterizes the lower tail of $f$, then
$-X_1, -X_2, \ldots, -X_n$ are independent with upper bound $-\varphi$ and the upper tail
of the distribution of $-X_1$ is characterized by $\nu$. Finally, the largest
$r$ order statistics based on $-X_1, -X_2, \ldots, -X_n$ are the negatives of the
smallest $r$ order statistics based on $X_1, X_2, \ldots, X_n$.
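
The reflection argument translates directly into code. The sketch below is illustrative only: the function names and the sample values are hypothetical. It evaluates the upper-bound estimator of the form (1) and obtains the lower-bound estimator by negating the data, estimating the upper bound of the negated sample, and negating the result.

```python
def linear_upper_estimate(sample, coeffs):
    """sum_i a_i * (i-th largest observation), as in (1)."""
    r = len(coeffs)
    if r > len(sample):
        raise ValueError("need at least r observations")
    largest = sorted(sample, reverse=True)[:r]  # Y_n, Y_{n-1}, ..., Y_{n-r+1}
    return sum(a * y for a, y in zip(coeffs, largest))


def linear_lower_estimate(sample, coeffs):
    """Estimate the lower bound via the upper bound of the negated sample."""
    return -linear_upper_estimate([-x for x in sample], coeffs)


# Hypothetical data; coefficients for nu = 1 and r = 2 from section 3:
# a_1 = 1 + 1/r = 1.5 and a_r = -1/r = -0.5.
sample = [0.1, 0.5, 0.9, 0.7, 0.3]
coeffs = [1.5, -0.5]
upper = linear_upper_estimate(sample, coeffs)  # 1.5*0.9 - 0.5*0.7
lower = linear_lower_estimate(sample, coeffs)  # 1.5*0.1 - 0.5*0.3
```

The lower-bound estimate equals the same coefficients applied to the $r$ smallest order statistics, which is the form of estimator sought in this section.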




References 


Cooke,  P.  J.  (1979).  Statistical  inference  for  bounds  of  random  variables. 
Biometrika.  To  appear. 

Gnedenko, B. (1943). Sur la distribution limite du terme maximum d'une
série aléatoire. Ann. Math. 44, 423-453.

Robson, D.S. and Whitlock, J.H. (1964). Estimation of a truncation point.
Biometrika, 51, 33-39.



Key Words

Linear estimator; Gnedenko's condition; Mean squared error;
Asymptotic relative efficiency.



Abstract

The problem of estimating the bounds of random variables has been
discussed in Cooke (1979). Here we discuss optimality of estimates
when the data are censored so that only the $r$ largest or smallest of
the observations are available for estimating a bound. For fixed $r$
we find the linear function of the censored data which is the optimal
estimator of a bound in the sense that, when the sample size is large,
the estimator has smallest mean squared error among all such linear
estimators. Provided $r$ is not close to one, these estimators are
almost optimal when the entire sample is available since, for example,
when estimating an upper bound and the sample size is large, the
largest few observations carry most of the information about the bound.
This fact is illustrated in one case.




